Abstract

Background: Apomixis, asexual seed production in plants, holds great potential for agriculture as a means to fix 
hybrid vigor. Apospory is a form of apomixis where the embryo develops from an unreduced egg that is derived 
from a somatic nucellar cell, the aposporous initial, via mitosis. Understanding the molecular mechanism regulating 
aposporous initial specification will be a critical step toward elucidation of apomixis and also provide insight into 
developmental regulation and downstream signaling that results in apomixis. To discover candidate transcripts for
regulating aposporous initial specification in P. squamulatum, we compared two transcriptomes derived from
microdissected ovules at the stage of aposporous initial formation: one from the apomictic donor parent, P.
squamulatum (accession PS26), and one from an apomictic derived backcross 8 (BC8) line containing only the
Apospory-Specific Genomic Region (ASGR)-carrier chromosome from P. squamulatum. Both transcriptomes were
sequenced using 454-FLX technology.

Results: Using 454-FLX technology, we generated 332,567 reads with an average read length of 147 base pairs
(bp) for the PS26 ovule transcriptome library and 363,637 reads with an average read length of 142 bp for the BC8
ovule transcriptome library. A total of 33,977 contigs from the PS26 ovule transcriptome library and 26,576 contigs
from the BC8 ovule transcriptome library were assembled using the Multifunctional Inertial Reference Assembly
program. Using stringent in silico parameters, 61 transcripts were predicted to map to the ASGR-carrier
chromosome, of which 49 transcripts were verified as ASGR-carrier chromosome specific. One of the alien
expressed genes could be assigned as tightly linked to the ASGR by screening of apomictic and sexual F1s. Only
one transcript, which did not map to the ASGR, showed expression primarily in reproductive tissue.

Conclusions: Our results suggest that a strategy of comparative sequencing of transcriptomes between donor 
parent and backcross lines containing an alien chromosome of interest can be an efficient method of identifying 
transcripts derived from an alien chromosome in a chromosome addition line. 



Background 

Apomixis, asexual reproduction through seed, is wide- 
spread among flowering plant families, but low in its fre- 
quency of occurrence [1]. Different from sexual 
reproduction, apomictically derived embryos develop 
autonomously from unreduced ovular cells instead of 
through fertilization of a reduced egg by a sperm. There- 
fore, the progeny of an apomictic plant are genetically 
identical to the maternal plant [2,3]. This trait can be
used as an advanced breeding tool in agriculture since it 
would enable fixation of hybrid vigor and seed propaga- 
tion of desirable genotypes [4-7]. No major agriculturally 
important crop possesses this trait [8-10]. Introgression 
of apomixis into crops through crossing has been 
impeded by factors such as polyploidy and incompatibil- 
ity [9]. Therefore, discovery of genetic mechanisms 
underlying apomixis will be crucial for manipulation of 
apomixis for introduction into target crops. 

Apomixis has been classified into two types and three 
developmental pathways: gametophytic apomixis, 
including apospory and diplospory, and sporophytic
apomixis, which is also known as adventitious embryony 
[2]. In sporophytic apomixis, an embryo forms directly 
from an ovular cell and coexists with the zygotic 
embryo. For gametophytic apomixis, the embryo devel- 
ops from an unreduced egg in an embryo sac derived 
through mitosis of either a somatic nucellar cell (aposp- 
ory) or the megaspore mother cell (diplospory). In 
apospory, meiosis either does not complete or its pro- 
ducts degenerate while aposporous initials (AIs) develop 
from one or more somatic nucellar cells. Both genotypes 
chosen for the present study are aposporous with the 
trait conferred by genetic elements from Pennisetum 
squamulatum. Aposporous P. squamulatum has four- 
nucleate embryo sacs that lack antipodals [10]. Aposp- 
ory in this species is inherited as a dominant Mendelian 
trait [11] and is associated with an approximately 50 
Mb, heterochromatic and hemizygous chromosomal 
region designated the Apospory-Specific Genomic 
Region (ASGR), [12,13]. 

Many transcriptional approaches to discover the regu- 
latory mechanisms and downstream effects associated 
with apomixis in many species have been undertaken. In 
Brachiaria, differential display applied to apomictic and 
sexual ovaries at anthesis yielded two apomixis-specific 
fragments [14] while a study on earlier sporogenesis and 
gametogenesis stages identified eleven differentially 
expressed fragments [15]. In Paspalum notatum, three 
expressed sequence tags (ESTs), all highly similar in 
sequence, showed differential expression in flowers 
between apomictic and sexual F1 individuals after apospory
initiation [16]. An additional 65 genes were identified
as differentially expressed between sexual and
aposporous plants [17]. cDNA-AFLP analysis in Paspa- 
lum simplex yielded transcripts linked to the apomixis- 
controlling locus (ACL). Many of these linked fragments 
showed stop codons and frameshift mutations, suggest- 
ing that they are pseudogenes [18]. cDNA-AFLP was 
also applied to identify apomixis candidate genes in Poa 
pratensis where 179 transcript-derived fragments from 
spikelets showed qualitative and quantitative expression 
differences between apomictic and sexual genotypes 
[19]. The full-length sequences of two genes of interest, 
PpSERK (SOMATIC EMBRYOGENESIS RECEPTOR- 
LIKE KINASE) and APOSTART were obtained and their 
temporal and spatial expression patterns were assessed 
by reverse transcription polymerase chain reaction (RT- 
PCR) and in situ hybridization, respectively. While 
neither one of these two candidate genes showed apo- 
mixis- or sexual-specific expression, quantitative differ- 
ences in expression between apomictic and sexual 
genotypes were observed [20]. 

One apomixis-specific gene was identified from a 
Panicum maximum ovule cDNA library and shown to 



be expressed in both aposporous initials and embryos at 
four days after anthesis [21,22]. Additional genes have 
been identified in Panicum through microarray and 
quantitative RT-PCR analysis [23]. In Pennisetum ciliare, 
differential display and suppression subtractive hybridi- 
zation were used to identify gene expression differences 
in ovaries of sexual and apomictic accessions [24,25]. 
SuperSAGE, a high-throughput differential display 
approach, has been used to discover several hundred 
transcripts with heterochronic shifts in expression 
between apomictic and sexual ovules at multiple stages 
of development [26,27]. 

Formation of aposporous initials is the first and most 
critical event for occurrence of apospory. Because the 
initiation of sexual and apomictic pathways likely is 
activated by different signals [28], understanding the 
molecular mechanism underlying apospory initiation 
can provide insight into developmental regulation and 
downstream signaling that results in apomixis. In 
order to discover candidates for regulating aposporous 
initial specification in P. squamulatum, we compared 
two transcriptomes derived from microdissected ovules 
at the stage of aposporous initial (AI) formation 
between the apomictic donor parent, P. squamulatum, 
and its apomictic derivative backcross 8 (BC8) containing
a single P. squamulatum chromosome. Initially, a
P. glaucum x P. squamulatum F1 was crossed with a P.
glaucum x P. purpureum F 2 and hybrid apomictic indi- 
viduals with good male fertility were selected [29]. 
Subsequent backcrosses with tetraploid P. glaucum
[30] yielded a BC8 line that was shown by FISH to
contain only one chromosome from P. squamulatum.
This single chromosome common to both apomictic
BC8 and P. squamulatum was the ASGR-carrier chromosome
based on the transmission of the trait of apomixis
and linked molecular markers [31]. We
hypothesize that candidate genes regulating aposporous
initial specification and localized to the ASGR will
function in both PS26 and BC8 at the same developmental
stage and will be identical in sequence as
they are related by descent.

The development and commercialization of new mas- 
sively parallel sequencing platforms have made tran- 
scriptome sequencing faster and more affordable. One 
platform, developed by 454 Life Sciences Corporation, 
the 454 GS-FLX sequencer, is capable of producing 100 
Mb of sequence data with an average read length of 250 
bp per bead in a 7-h run [32]. Successful applications of 
these high-throughput sequencing technologies to tran- 
scriptome analysis have been reported [33-37]. Here we 
present expressed sequence tags (ESTs) generated by 
Roche 454 high-throughput sequencing technology from 
dissected ovule tissues staged for aposporous initial 
formation from two apomictic lines chosen for their 
common features of apospory and a single shared chromosome.
Alien chromosome (ASGR-carrier chromosome)
expressed transcripts were identified and tested
for ASGR linkage and tissue expression. 

Results 

Aposporous ovule-enriched cDNA samples for sequencing 

Ovules from PS26 and BC8 around the stage of aposporous
initial formation were manually dissected from pistils
(Figure 1). Three biological replicates of 40 ovules
each were collected for both PS26 and BC8. The yield of
total RNA from each replicate was approximately 20 ng,
from which 15 ng was used for one round of T7 RNA
polymerase-based RNA amplification. The average yield
from one round of amplification was 90 µg. For each
library, equal amounts of amplified RNA from each
replicate were combined and 15 µg of amplified RNA was
used for ds-cDNA synthesis. The majority of the ds-cDNA
synthesized from amplified RNA was distributed
in a size range from 200 bp to 1000 bp (Figure 2).

Assembly of sequences from PS26 and BC8 aposporous
ovules

Two aposporous ovule transcriptomes, one from PS26
and the other from BC8, were sequenced using the
high-throughput 454-FLX sequencer. The PS26 transcriptome
library contained 332,567 reads with an average
read length of 147 base pairs (bp) and the BC8
transcriptome library contained 363,637 reads with an
average read length of 142 bp. Assembly by the Multifunctional
Inertial Reference Assembly (MIRA) program
[38] resulted in 33,977 contigs from the PS26 ovule
transcriptome library and 26,576 contigs from the BC8
ovule transcriptome library (Additional file 1:
PS26_MIRA.fasta, Additional file 2: BC8_MIRA.fasta).
The number of reads per contig ranged from 1 to 759
in PS26 assemblies and 1 to 1661 in BC8 assemblies,
with the majority having fewer than 30 reads per assembly
in both cases. The numbers of singletons in PS26 and
BC8 libraries were 176 and 78, respectively.




Figure 1 Microdissection and ovary clearing. a: cleared ovary
showing no aposporous initials and prior to megasporogenesis. b:
cleared ovary showing two aposporous initials, indicated by solid
arrows.

Figure 2 Agilent Bioanalyzer 2100 analysis result of the ds-cDNA
samples.



Blast2GO 

Contigs from both transcriptome libraries were analyzed
for biological functions using Blast2GO [39]. For both
libraries, the use of T7 amplified RNA biased the
sequencing data toward the 3' UTR region as shown by
the BlastX results of the Blast2GO analysis. 5,730 PS26
contigs (~17%) and 4,833 BC8 contigs (~18%) had hits
against the nr database of NCBI with an E-value cut-off
of e-06. For both libraries, 90% of the top BlastX hits
were, in order, to Sorghum bicolor, Zea mays or Oryza
sativa proteins. Blast2GO was able to fully annotate
4,400 PS26 contigs and 3,692 BC8 contigs (Figure 3).

To obtain additional functional data from the shorter
reads, a study was initiated to test whether the most significant
BlastN hit in the EST_OTHERS database (E-value cut-off
of e-20) could be used as a surrogate longer sequencing
read for the PS26/BC8 transcripts. Approximately 55%
(14,518) of the BC8 contigs had an EST_OTHERS hit
<e-20. Blast2GO analysis was used for the BC8_EST_OTHERS
best matches and compared with Blast2GO
mapping results for the 3,692 annotated BC8 contigs.
The majority (84%) of the BC8 contigs had Blast2GO
mapping data identical to the corresponding BC8_EST_OTHERS
mapping data while only 5% of the BC8 contigs
had >50% non-matching mapping data. Given the
large percentage of identical and/or highly matching
mapping data, a library of PS26_EST_OTHERS was also
established using the same parameters as BC8_EST_OTHERS.
Approximately 53% (18,028) of the PS26 contigs
had an EST_OTHERS hit <e-20. Blast2GO was able
to fully annotate 12,462 PS26_EST_OTHERS contigs
and 10,107 BC8_EST_OTHERS contigs.
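The rounded percentages quoted in this section follow directly from the contig counts. A minimal Python check is sketched here; it uses only the counts stated in the text, and the helper name `percent` is illustrative rather than part of the original analysis.

```python
# Contig totals and hit counts as reported in the text.
TOTAL_CONTIGS = {"PS26": 33977, "BC8": 26576}
BLASTX_NR_HITS = {"PS26": 5730, "BC8": 4833}      # BlastX hits against nr
EST_OTHERS_HITS = {"PS26": 18028, "BC8": 14518}   # BlastN hits against EST_OTHERS


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


for library in ("PS26", "BC8"):
    total = TOTAL_CONTIGS[library]
    print(
        library,
        f"BlastX nr: {percent(BLASTX_NR_HITS[library], total):.1f}%",
        f"EST_OTHERS: {percent(EST_OTHERS_HITS[library], total):.1f}%",
    )
```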

A Fisher's exact test (using GOSSIP; [40]) was done
to identify significant differences of expression data
between the PS26 and BC8 libraries and between the
PS26_EST_OTHERS and BC8_EST_OTHERS libraries. At a false
discovery rate (FDR) <0.01, 28 GO terms were identified
as different between the PS26 and BC8 libraries (Table
1). However, when the PS26_EST_OTHERS and
BC8_EST_OTHERS libraries were compared at FDR
<0.05 (at an FDR <0.01 no significant results were
returned), only 7 GO terms (ribosome, translation, ribosome
biogenesis, ribonucleoprotein complex biogenesis,
ribonucleoprotein complex, structural constituent of
ribosome, cellular component biogenesis) were identified
as differentially expressed between the two libraries
(Table 1).

Figure 3 Blast2GO Level 3 biological processes for PS26 and BC8.
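The GO-term comparison above rests on a Fisher's exact test per term followed by a false discovery rate correction. The sketch below illustrates that general technique with the Python standard library only. It is not the GOSSIP implementation, the Benjamini-Hochberg adjustment is an assumed choice of FDR procedure, and the per-term counts in the example are hypothetical (only the library totals of 4,400 and 3,692 annotated contigs come from the text).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: contigs with / without the GO term in library 1
    c, d: contigs with / without the GO term in library 2
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x term-annotated contigs in library 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts of contigs carrying each term: (library 1, library 2).
ANNOTATED = (4400, 3692)  # fully annotated PS26 and BC8 contigs (from the text)
hypothetical_terms = {"term_A": (120, 60), "term_B": (200, 170), "term_C": (15, 40)}

raw = [fisher_exact_two_sided(x, ANNOTATED[0] - x, y, ANNOTATED[1] - y)
       for x, y in hypothetical_terms.values()]
for name, p, q in zip(hypothetical_terms, raw, benjamini_hochberg(raw)):
    print(f"{name}: p = {p:.3g}, adjusted p = {q:.3g}")
```

A term would be reported at a given FDR threshold when its adjusted p-value falls below that threshold.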

In Silico identification of putative alien expressed 
transcripts 

When MIRA-assembled contigs from the two libraries
were analyzed by BlastN with PS26 sequences as queries
and BC8 sequences as the database, a total of 118 comparisons
were obtained with 100% sequence identity
across an overlapping region >100 bp, corresponding to
115 unique contigs from the PS26 database and 116
unique contigs from the BC8 database. The 118 PS26/BC8
contig pairs were further analyzed by aligning the corresponding
PS26 and BC8 contigs with each other, resulting
in 61 inter-genotype contigs that aligned with no
mismatches. The average overlapping region of
the 61 inter-genotype contigs was 241 bp (ranging from
181 bp to 419 bp) with an average number of 28
sequence reads. The remaining PS26/BC8 contigs, while
initially identified by BlastN as having 100% identity
over a region >100 bp, did not continue to share
sequence similarity outside this region and therefore did
not align over the whole contig.
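The first filtering step described above (100% identity across an overlap >100 bp) can be sketched as a small parser. This is an illustration only: it assumes the BlastN results are available in the standard 12-column tabular layout (query, subject, percent identity, alignment length, ...), which the text does not state, and the BC8 contig names and hit lines in the example are hypothetical.

```python
def perfect_overlaps(blast_lines, min_length=100):
    """Keep hits with 100% identity over an alignment longer than min_length bp.

    Each line is expected in tab-separated BLAST tabular form:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity == 100.0 and length > min_length:
            hits.append((query, subject, length))
    return hits


# Hypothetical example lines (PS26 contigs as queries, BC8 contigs as subjects).
example = [
    "PS26_c0001\tBC8_c0101\t100.00\t241\t0\t0\t1\t241\t5\t245\t1e-120\t446",
    "PS26_c0002\tBC8_c0102\t99.10\t310\t3\t0\t1\t310\t1\t310\t1e-150\t540",
    "PS26_c0003\tBC8_c0103\t100.00\t85\t0\t0\t10\t94\t1\t85\t2e-40\t158",
]

kept = perfect_overlaps(example)
print(kept)
print("unique PS26 contigs:", len({q for q, _, _ in kept}))
print("unique BC8 contigs:", len({s for _, s, _ in kept}))
```

Pairs passing this filter would still need the second step described in the text, a full alignment of the two contigs, to confirm that they match along their whole length.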

Mapping and predicted function of putative ASGR-carrier 
chromosome transcripts 

Up to four primer pairs per contig were used to test for
linkage of the 61 contigs to the ASGR-carrier chromosome.
Sequence characterized amplified region (SCAR)
primer pairs were designed based on the PS26 contig
sequence (Additional file 3, Table S1). After screening
by PCR against PS26, IA4X (4x P. glaucum), N37
(P. purpureum) and a small number of progeny from
apomictic BC8 segregating for mode of reproduction, 45
contigs showed specific amplification from PS26 and
apomictic BC8 but no amplification from IA4X or sexual
BC8 individuals (Figure 4, Table 2), establishing linkage
of 45 contigs to the ASGR-carrier chromosome. Single-strand
conformation polymorphism (SSCP) analysis and
a CAPS screen using two to four restriction enzymes
were applied to the 14 primer pairs that amplified products
in both PS26 and IA4X DNA. Four additional
contigs could be linked to the ASGR-carrier chromosome
using SSCP analysis (Table 2). The CAPS screen
identified a HaeIII polymorphism for PS26_c2552, a
transcript also mapped by SSCP.

Table 1 GO terms within the biological process category significantly over- or under-represented between the
libraries.

GO term ID | Description | Adjusted p-value for Ps26_contigs (FDR <0.01) | Over/under representation, Ps26_contigs | Adjusted p-value for Ps26_EST_OTHERS contigs (FDR <0.05) | Over/under representation, Ps26_EST_OTHERS contigs
GO:0005840 | ribosome | 1.90E-05 | over | 1.10E-04 | over
GO:0006412 | translation | 5.81E-06 | over | 1.47E-04 | over
GO:0042254 | ribosome biogenesis | 6.05E-06 | over | 1.61E-04 | over
GO:0022613 | ribonucleoprotein complex biogenesis | 7.31E-06 | over | 1.61E-04 | over
GO:0030529 | ribonucleoprotein complex | 2.47E-05 | over | 1.73E-04 | over
GO:0003735 | structural constituent of ribosome | 6.02E-06 | over | 2.05E-04 | over
GO:0044085 | cellular component biogenesis | 9.23E-06 | over | 2.34E-04 | over
GO:0043228 | non-membrane-bounded organelle | 1.07E-05 | over | n.s. |
GO:0043232 | intracellular non-membrane-bounded organelle | 1.07E-05 | over | n.s. |
GO:0005198 | structural molecule activity | [illegible] | over | n.s. |
GO:0034645 | cellular macromolecule biosynthetic process | 1.52E-04 | over | n.s. |
GO:0032559 | adenyl ribonucleotide binding | 1.24E-05 | under | n.s. |
GO:0005524 | ATP binding | 1.46E-05 | under | n.s. |
GO:0032553 | ribonucleotide binding | 1.60E-05 | under | n.s. |
GO:0032555 | purine ribonucleotide binding | 1.60E-05 | under | n.s. |
GO:0000166 | nucleotide binding | 5.36E-05 | under | n.s. |
GO:0001882 | nucleoside binding | [illegible] | under | n.s. |
GO:0017076 | purine nucleotide binding | [illegible] | under | n.s. |
GO:0001883 | purine nucleoside binding | 5.91E-05 | under | n.s. |
GO:0030554 | adenyl nucleotide binding | 5.91E-05 | under | n.s. |
GO:0003824 | catalytic activity | 9.79E-05 | under | n.s. |
GO:0016740 | transferase activity | 1.38E-04 | under | n.s. |
GO:0006793 | phosphorus metabolic process | 1.67E-04 | under | n.s. |
GO:0006796 | phosphate metabolic process | 1.67E-04 | under | n.s. |
GO:0006073 | cellular glucan metabolic process | 1.75E-04 | under | n.s. |
GO:0044042 | glucan metabolic process | 1.75E-04 | under | n.s. |
GO:0016773 | phosphotransferase activity, alcohol group as acceptor | 2.50E-04 | under | n.s. |
GO:0016310 | phosphorylation | 3.01E-04 | under | n.s. |



Figure 4 Examples for mapping of transcripts to the ASGR-carrier
chromosome. a: amplification from PS26, N37 and apomictic
BC8 but not from IA4X or sexual BC8 (PS26_c583: p1510/p1511). b:
amplification from PS26 and apomictic BC8 but not from IA4X, N37 or
sexual BC8 (PS26_c9369: p1514/p1515). c: amplification from PS26,
IA4X, N37 and both apomictic and sexual BC8 (no specificity;
PS26_c4364: p1504/p1505). Specificity for PS26_c4364 subsequently
was achieved by using a different primer pair (Table 2).

The markers from the 49 ASGR-carrier chromosome-linked
contigs were initially screened on a limited number
of apomictic (4) and sexual (4) F1s for mapping to
the ASGR. This resulted in one contig, PS26_c9369,
showing tight linkage to the ASGR, as the primers
amplified DNA from apomictic F1s only and not from sexual
F1s (Figure 5, Table 2). The remaining primer sets did
not show amplification specificity in the F1 population;
both apomictic and sexual progeny amplified.

A larger F1 population of 22 individuals (10 apomictic
and 12 sexual) was used to map the PS26_c9369 and
PS26_c2552 transcripts. PS26_c2552 was mapped based
on the HaeIII polymorphism found in the CAPS screen
between PS26 and IA4X and also seen in the F1
population. PS26_c2552 is unlinked to the ASGR as the
CAPS polymorphism segregated 1:1 in the population
but with 7 sexual and 5 apomictic individuals containing
the marker. In comparison, the PS26_c9369 primers
remained specific to the 10 apomictic plants and did not
amplify the 12 sexual plants.
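The reasoning behind "segregated 1:1 but unlinked" can be made explicit with two simple statistics: a chi-square goodness-of-fit test for 1:1 marker segregation, and the fraction of plants in which marker presence agrees with the apomictic phenotype. The sketch below uses only the F1 counts given in the text; the function names are illustrative, and the 3.841 critical value is the standard 5% threshold for one degree of freedom.

```python
CHI2_CRITICAL_DF1 = 3.841  # 5% critical value, 1 degree of freedom


def chi_square_1to1(present, absent):
    """Chi-square statistic for a 1:1 presence/absence segregation."""
    expected = (present + absent) / 2
    return ((present - expected) ** 2 + (absent - expected) ** 2) / expected


def concordance(apo_with, apo_total, sex_with, sex_total):
    """Fraction of plants where marker presence matches the apomictic phenotype."""
    matches = apo_with + (sex_total - sex_with)
    return matches / (apo_total + sex_total)


# F1 counts from the text: 10 apomictic and 12 sexual individuals.
# PS26_c2552 marker: present in 5 apomictic and 7 sexual plants.
chi2_c2552 = chi_square_1to1(present=5 + 7, absent=(10 - 5) + (12 - 7))
conc_c2552 = concordance(5, 10, 7, 12)
# PS26_c9369 marker: present in all 10 apomictic and none of the 12 sexual plants.
conc_c9369 = concordance(10, 10, 0, 12)

print(f"PS26_c2552: chi-square (1:1) = {chi2_c2552:.3f}, "
      f"fits 1:1 = {chi2_c2552 < CHI2_CRITICAL_DF1}, concordance = {conc_c2552:.2f}")
print(f"PS26_c9369: concordance = {conc_c9369:.2f}")
```

A marker that fits 1:1 segregation yet agrees with the phenotype in only about half of the plants behaves as expected for a locus unlinked to the ASGR, whereas complete concordance is what tight linkage predicts in a population of this size.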

BlastX searches against NCBI databases were carried
out for the 49 PS26/BC8 ASGR-carrier chromosome
linked contigs and best protein hits for 18 contigs are
summarized in Table 3. Because the sequences are 3'
biased, a BlastN analysis against the expressed sequence
tag (EST_OTHERS) database at NCBI with the remaining
31 PS26/BC8 contigs was done to find potential
orthologs from other species. At an E-value cut-off of
e-20, 18 contigs had EST hits (Table 3). A BlastX was performed
using these EST sequences to determine if tentative
protein functions could be obtained, and the best
hits are listed in Table 3. The remaining 13 (27%) contigs
did not have hits by either BlastX or BlastN; therefore,
they were considered orphan genes.

In order to generate contiguous sequence that might
enhance the potential for mapping of contigs in the F1
population and to extract a longer cDNA sequence for
PS26_c9369, a cDNA library containing ~300,000 phage
plaques was constructed from apomictic BC8 mature
ovary and anther RNA, since all 49 ASGR-carrier chromosome
transcripts showed expression in these tissues
by RT-PCR. Screening of the cDNA library with 27
ASGR-carrier chromosome transcript probes yielded
hybridization signals for 24 probes. PCR screening with
the ASGR-carrier chromosome-specific primers identified
16 ASGR-carrier chromosome clones and one clone
for PS26_c9369. Additional sequence for these clones
was generated.



The PS26_c9369 clone contained a 646 bp insert.
BlastX analysis identified similarity to a Sorghum bicolor hypothetical
protein SORBIDRAFT_10g020450 (XP_002438482.1; E-value
6e-18) and an Oryza sativa hypothetical protein
OsJ_30933 (EAZ15525.1; E-value 4e-16) over an ~155 bp
region. In both sorghum and rice, the area of similarity
overlapped a pfam03004: Transposase_24 domain for
those proteins. The remaining PS26_c9369 clone
sequence was unique. Nine primer sets were designed
from nine PS26 contigs to span introns based on predicted
splicing of best hits to sorghum. Five primer sets
gave strong amplification of PS26 genomic DNA. These
amplicons were cloned and sequenced to identify SNPs
within the PS26 genomic alleles. CAPS markers could
be designed for PS26_c1580 (HpyCH4IV) and
PS26_c33813 (HpyCH4IV). Mapping of 4 apomictic and
4 sexual F1s did not show tight linkage of these contigs
to the ASGR.

Expression profiles of ASGR-linked expressed transcripts 
by RT-PCR 

RT-PCR with RNA extracted from apomictic BC8 leaf,
root, anther, and ovary tissues was completed for the 49
candidate genes mapped to the ASGR-carrier chromosome.
Forty-seven were expressed in all four organ types
examined (Figure 6a). However, one putative MADS-domain
containing transcription factor, corresponding
to contig PS26_c33813, showed amplification only in
anther and ovary tissues (Figure 6b) and contig
PS26_c10535, a putative Lon protease, showed expression
in all organs except anther.

Discussion 

Transcriptional profiling has been extensively used for 
gene discovery in plants because the absence of introns 
greatly enhances the information content of the data set 
and eases data interpretation [41-43]. Combined with 
454 high-throughput sequencing technology, transcrip- 
tome sequencing has become an approach to under- 
stand molecular events at the gene expression level on a 
genome-wide scale. Many successful applications of 454 
sequencing technology in transcriptome sequencing and 
single nucleotide polymorphism (SNP) discovery have 
been reported [44-49] and supported our use of this 
technology for ovule transcriptome sequencing. 

In contrast to studies aimed at identifying genes
involved in apomictic reproduction through the identification
of differences between apomictic and sexual genotypes,
our study compared two apomictic lines for
identical transcripts. We previously reported that the
ASGR is sufficient to induce apomixis in sexual pearl
millet [11,12]; therefore, the trait of apomixis in BC8 is
conferred by the ASGR-carrier chromosome from PS26
[31]. In the present study, we have attempted to identify
candidate genes regulating the first step of apomixis,
aposporous initial development, by transcriptome analysis
of ovules from both PS26 and BC8. The ovules were
collected at the stage of aposporous initial development,
which ranged from no apparent aposporous initials (~70%)
to distinct aposporous initials observed (~30%). By pooling
ovules over this range of development, our objective
was to minimize the chance of missing genes involved
in the pathway of apomixis initiation, since we would
predict transcription prior to, and perhaps beyond,
aposporous initial formation.

Table 2 Summary of mapping results

PS26 contig name | Primers | Size | PS26 | IA4X | N37 | Transcripts mapped to the ASGR-carrier chromosome | Transcripts mapped as tightly linked to the ASGR locus
PS26_c9369 | 1514/1515 | 274 | + | - | - | Yes | Yes
PS26_c10331 | 1476/1477 | 210 | + | - | - | Yes | np
PS26_c13922 | 1486/1487 | 200 | + | - | - | Yes | np
PS26_c5080 | 1506/1507 | 204 | + | - | - | Yes | np
 | 1744/1745 CAPS | 800 | + | + | N/A | Yes | No
PS26_c2339 | 1528/1529 | 213 | + | - | - | Yes | np
PS26_c2785 | 1534/1535 | 226 | + | - | - | Yes | np
PS26_c194 | 1604/1605 | 283 | + | - | - | Yes | np
PS26_c2838 | 1642/1643 | 103 | + | - | - | Yes | np
PS26_c3609 | 1646/1647 | 150 | + | - | - | Yes | np
PS26_c5210 | 1652/1653 | 157 | + | - | - | Yes | np
PS26_c6744 | 1658/1659 | 202 | + | - | - | Yes | np
PS26_c5851 | 1654/1655 | 179 | + | - | - | Yes | np
PS26_c1406 | 1583/1681 | 250 | + | - | - | Yes | np
PS26_c28392 | 1704/1705 | 181 | + | - | - | Yes | np
PS26_c4364 | 1505/1716 | 150 | + | - | - | Yes | np
 | 1504/1505 | 140 | + | + | + | np | np
PS26_c11544 | 1478/1479 | 165 | + | - | + | Yes | np
PS26_c13157 | 1480/1481 | 161 | + | - | + | Yes | np
PS26_c13655 | 1482/1483 | 214 | + | - | + | Yes | np
PS26_c1372 | 1484/1485 | 215 | + | - | + | Yes | np
PS26_c2448 | 1492/1493 | 189 | + | - | + | Yes | np
PS26_c30691 | 1498/1499 | 206 | + | - | + | Yes | np
PS26_c3546 | 1500/1501 | 245 | + | - | + | Yes | np
PS26_c583 | 1510/1511 | 212 | + | - | + | Yes | np
PS26_c8165 | 1512/1513 | 150 | + | - | + | Yes | np
PS26_c1279 | 1530/1531 | 228 | + | - | + | Yes | np
PS26_c7587 | 1532/1533 | 172 | + | - | + | Yes | np
PS26_c17388 | 1538/1539 | 163 | + | - | + | Yes | np
PS26_c3455 | 1540/1541 | 102 | + | - | + | Yes | np
PS26_c1312 | 1542/1543 | 143 | + | - | + | Yes | np
PS26_c338 | 1548/1549 | 120 | + | - | + | Yes | np
PS26_c33813 | 1565/1566 | 140 | + | - | + | Yes | np
 | 1724/1725 CAPS | 900 | + | + | N/A | Yes | No
PS26_c1422 | 1567/1568 | 120 | + | - | + | Yes | np
PS26_c6131 | 1571/1572 | 179 | + | - | + | Yes | np
PS26_c2388 | 1575/1576 | 128 | + | - | + | Yes | np
PS26_c32589 | 1581/1582 | 216 | + | - | + | Yes | np
PS26_c10535 | 1630/1631 | 148 | + | - | + | Yes | np
PS26_c2807 | 1640/1641 | 164 | + | - | + | Yes | np
PS26_c9776 | 1664/1665 | 170 | + | - | + | Yes | np
PS26_c6373 | 1656/1657 | 178 | + | - | + | Yes | np
PS26_c1878 | 1690/1691 | 157 | + | - | + | Yes | np
PS26_c19109 | 1692/1693 | 163 | + | - | + | Yes | np
PS26_c22381 | 1696/1697 | 246 | + | - | + | Yes | np
PS26_c4150 | 1650/1715 | 450 | + | - | + | Yes | np
PS26_c704 | 1708/1709 | 155 | + | - | + | Yes | np
PS26_c3993 | 1502/1713 | 800 | + | - | + | Yes | np
PS26_c30198 | 1496/1497 SSCP | 210 | + | + | + | Yes | np
PS26_c1472 | 1573/1574 SSCP | 185 | + | + | + | Yes | np
PS26_c2552 | 1670/1671 | 243 | + | + | + | Yes | No
PS26_c14[?]1[?] | 1666/1667 SSCP | 17[?] | + | + | + | Yes | np
PS26_c[?]192 | 1720/1721 | 140 | + | + | + | np | np
PS26_c20942 | 1488/1489 | 12[?] | + | + | + | np | np
PS26_c24[?]01 | 1490/1491 | 120 | + | + | + | np | np
PS26_c25664 | 1494/1495 | 193 | + | + | + | no | no
PS26_c5781 | 1508/1509 | 156 | + | + | + | np | np
PS26_c2405 | 1577/1578 | 180 | + | + | + | np | np
PS26_c15085 | 1579/1580 | 120 | + | + | + | np | np
PS26_c1580 | 1628/1629 | 237 | + | + | + | np | np
PS26_c18163 | 1632/1633 | 169 | + | + | + | np | np
PS26_c3656 | 1648/1649 | 152 | + | + | + | np | np
PS26_c21597 | 1668/1669 | 150 | + | + | + | np | np
PS26_c8378 | 1662/1663 | 199 | pf | pf | pf | N/A | N/A

The one contig mapped to the ASGR is PS26_c9369 (first row). +: positive amplification; -: no amplification; pf: primer failure; np: no polymorphism available for
mapping; N/A: not assayed. [?]: character illegible in the source. Primer sequences and annealing temperatures can be found in Additional file 3 - Table S1.

The two ovule transcriptomes generated had an average
read length of ~150 bp, shorter than the average
read length of 200-300 bases for the 454 GS FLX
sequencer. The shorter than expected reads could have
been due to a combination of factors in preparing the
samples for sequencing, such as the T7-based antisense
RNA amplification method, the conversion of antisense
RNA to cDNA, or the shearing of the
cDNA to prepare the sequencing library. Another possible
factor is the species itself. It has been shown that
the average read length can vary among different organisms
due to differences in AT/GC content [32].

Figure 5 Examples for mapping of transcripts to the ASGR. a:
amplification of apomictic F1s but not sexual F1s (PS26_c9369:
p1514/p1515). b: amplification of both apomictic F1s and sexual F1s
(PS26_c5080: p1506/p1507).

Even with short reads and using stringent comparison 
conditions to decrease the number of false positive joins 
between highly similar but not identical transcripts from 
the two species, 61 putative ASGR-carrier chromosome 
candidate expressed genes were identified in silico, of 
which 49 have confirmed linkage to the ASGR-carrier 
chromosome. The 3' bias of the T7 amplified transcripts 
helped in the design of primers to discriminate between 
P. squamulatum and the BC 8 pearl millet genome con- 
taining one P. squamulatum chromosome. Our sequen- 
cing strategy helped remove, at least to a chromosomal 
level, the difficulties associated with candidate gene 
identification by comparative gene expression analysis in 
apomictic and sexual systems which lack, due to the 
apomictic process, an ability to generate isogenic lines 
that vary only in their mode of reproduction. Primer 
specificity for 48 transcripts was not seen when we
attempted to map SCARs to the ASGR using an F1 population
containing many P. squamulatum chromosomes.
The additional sequence generated by the phage cDNA
clones allowed mapping of two more transcripts in the
F1 population. Greater sequence length would be advantageous
for mapping of the ASGR-carrier chromosome
transcripts to the ASGR locus. 

The use of the gene ontology software Blast2GO
allowed comparison of both the PS26 and BC8 libraries
and the PS26_EST_OTHERS and BC8_EST_OTHERS
libraries created by using the most significant
EST_OTHERS BlastN result as a surrogate for our sequences.
The PS26 and BC8 transcriptomes were almost identical
Table 3 Potential function of transcripts mapping to the ASGR-carrier chromosome based on BlastX or BlastN.

Ps26 contig | BC8 contig | Overlap length (bp) | BlastX | BlastN (E-value) to EST_OTHERS | BlastX of EST hit in BlastN column
Ps26_c10331 | BC8_c7991 | 241 | no hit | RCRST0_005870 Foxtail millet EC612643.1 GI:149362118 (3e-55) | no hit
Ps26_c11544 | BC8_c10325 | [illegible] | no hit | no hit | -
Ps26_c13157 | BC8_c5112 | [illegible] | no hit | no hit | -
Ps26_c13655 | BC8_c24571 | [illegible] | no hit | [clone ID illegible] Apomictic pistil [accession illegible] GI:27532285 (8e-24) | putative 26S proteasome non-ATPase regulatory subunit 3 ACG34075.1 GI:195624490
Ps26_c1372 | BC8_c12789 | 326 | no hit | CCGC4364.g1 CCGC Panicum virgatum early floral buds + reproductive tissue FL750787.1 GI:198007657 (e-174) | NADH-ubiquinone oxidoreductase 51 kDa subunit NP_001148767.1 GI:226532265
Ps26_c13922 | BC8_c12833 | 212 | no hit | no hit | -
Ps26_c2448 | BC8_c12858 | 225 | no hit | pPAP_10_F04 Apomictic pistil FL813942.1 GI:198086024 (2e-57) | ankyrin protein kinase-like NP_001152470.1 GI:226495939
Ps26_c30691 | BC8_c10294 | 206 | no hit | no hit | -
Ps26_c3546 | BC8_c8622 | [illegible] | no hit | no hit | -
Ps26_c5080 | BC8_c12542 | [illegible] | hypothetical protein OsJ_[illegible] EEE67490.1 GI:222637358 | - | -
Ps26_[illegible] | BC8_[illegible] | 223 | no hit | 6XJF-rd_A11 pAPO Cenchrus ciliaris [accession, GI and E-value illegible] | SRC2 protein kinase C [remainder illegible] GI:195641696
Ps26_c8165 | BC8_c5964 | [illegible] | no hit | [clone ID illegible] pAPO Cenchrus ciliaris [accession illegible] GI:164123871 (7e-70) | [illegible] localized protein 2 FAA00302.1 GI:119657406
Ps26_c9369 | BC8_c3452 | [illegible] | hypothetical protein OsJ_30933 EAZ15525.1 GI:125574241 | - | -
Ps26_c2339 | BC8_c7917 | [illegible] | no hit | [clone ID illegible] Panicum virgatum late flowering buds FL812358.1 GI:198084376 (e-23) | [illegible] PHENOTYPE 4) NP_195760.1 GI:15240970
Ps26_c1279 | BC8_c8634 | 243 | ENT domain containing protein ACG36577.1 GI:195629872 | - | -
Ps26_c7587 | BC8_c11918 | 202 | ATPNG1 (Arabidopsis thaliana PEPTIDE-N-GLYCANASE 1) NP_199768.1 GI:15240508 | - | -
Ps26_c2785 | BC8_c8847 | 273 | ubiquitin-conjugating enzyme E2 N NP_001148361.1 GI:226491078 | - | -
Ps26_c194 | BC8_c2920 | 304 | no hit | no hit | -
Ps26_c17388 | BC8_c6454 | 208 | histone 4 BAG68513.1 GI:195972757 | - | -
Ps26_c3455 | BC8_c8607 | [illegible] | [illegible] | [clone ID illegible] pAPO Cenchrus ciliaris [accession illegible] GI:164198597 (e-102) | putative [illegible] XP_002529162.1 GI:255576542
Ps26_c1312 | BC8_c3757 | 313 | no hit | 25X_JF_D10 pAPO Cenchrus ciliaris EB656417.1 GI:164027660 (2e-47) | protein phosphatase 2A regulatory subunit A AAM94368.1 GI:22296816
Ps26_c338 | BC8_c3527 | 419 | universal stress protein (USP) family protein NP_001159067.1 GI:259490110 | - | -
Ps26_c33813 | BC8_c2708 | 229 | putative MADS-domain transcription factor CAA70485.1 GI:3851333 | - | -
Ps26_c1422 | BC8_c3852 | [illegible] | no hit | no hit | -
Ps26_[illegible] | BC8_[illegible] | 224 | no hit | CCHY9952.g1 CCHY Panicum virgatum callus [accession, GI and E-value illegible] | putative calcium-dependent protein kinase [accession illegible] GI:195653505
Ps26_c2388 | BC8_c2949 | [illegible] | no hit | [clone ID illegible] pAPO Cenchrus ciliaris EB662068.1 GI:164227478 (3e-48) | [illegible] precursor ACG36448.1 GI:195629614
Ps26_c32589 | BC8_c3672 | [illegible] | no hit | [clone ID illegible] pAPO Cenchrus ciliaris [accession illegible] GI:163993222 (5e-93) | putative [illegible] protein CAD23144.1 GI:37776903
Ps26_c10535 | BC8_c22186 | [illegible] | [cells not recovered] | [cells not recovered] | [cells not recovered]
Ps26_c2807 | BC8_c12602 | [illegible] | no hit | [clone ID illegible] pAPO Cenchrus ciliaris [accession illegible] GI:164180053 (6e-116) | Phosphoglucomutase/phosphomannomutase C terminal ABN08987.1 GI:124361015
Ps26_c2838 | BC8_c3538 | [illegible] | no hit | no hit | -
Ps26_c3609 | BC8_c10814 | [illegible] | no hit | no hit | -
Ps26_c5210 | BC8_c5192 | [illegible] | hypothetical protein OsJ_[illegible] EEE67565.1 GI:222637433 | - | -
Ps26_[illegible] | BC8_[illegible] | 257 | ADP-ribosylation factor [accession and GI illegible] | - | -
Ps26_[illegible] | BC8_[illegible] | 258 | no hit | MK_7_78 Pennisetum glaucum seedlings [accession, GI and E-value illegible] | hypothetical protein SORBIDRAFT_[illegible] XP_002444160.1 GI:242078783
Ps26_c5851 | BC8_c5854 | [illegible] | no hit | no hit | -
Ps26_c6373 | BC8_c6664 | [illegible] | no hit | [clone ID illegible] Zea mays [accession illegible] GI:211043870 (2e-41) | hypothetical protein LOC[illegible] NP_001143786.1 GI:226505008
Ps26_[illegible] | BC8_[illegible] | 220 | centromere/microtubule binding protein [illegible], putative XP_002523427.1 GI:255564866 | - | -
Ps26_[illegible] | BC8_[illegible] | 246 | fk506-binding protein, putative [accession and GI illegible] | - | -
Ps26_c4364 | BC8_c15332 | 181 | no hit | CCHZ9541.g1 CCHZ Panicum virgatum [accession, GI and E-value illegible] | helix-loop-helix-like protein [accession and GI illegible]
Ps26_c1472 | BC8_c3819 | 330 | small zinc finger-like protein AAD40002.1 GI:5107180 | - | -
Ps26_c1406 | BC8_c4551 | 221 | putative anther ethylene-upregulated protein ER1 BAC79907.1 GI:33146619 | - | -
Ps26_c1878 | BC8_c7425 | 242 | no hit | CCGI4193.g1 CCGI Panicum virgatum FL856163.1 GI:198128193 (3e-69) | hypothetical protein SORBIDRAFT_02g036200 XP_002460850.1 GI:242045958
Ps26_c19109 | BC8_c9186 | 205 | no hit | no hit | -
Ps26_c22381 | BC8_c547 | 185 | no hit | 2X6IIIJF-rd_A11 pAPO Cenchrus ciliaris EB652659.1 GI:164076750 (7e-58) | APx2 - Cytosolic Ascorbate Peroxidase ACG41151.1 GI:195643366
Ps26_c28392 | BC8_c12100 | 230 | no hit | no hit | -
Ps26_c4150 | BC8_c3261 | 276 | rRNA-processing protein EBP2, putative XP_002526440.1 GI:255570978 | - | -
Ps26_c704 | BC8_c1322 | 368 | 26S protease regulatory subunit, putative XP_002526219.1 GI:255570523 | - | -
Ps26_c2552 | BC8_c1808 | 384 | 40S ribosomal protein S6 ACG31980.1 GI:195620300 | - | -
Ps26_c14318 | BC8_c14583 | 366 | triose phosphate/phosphate translocator ACG33816.1 GI:195623972 | - | -

Potential functions of the inter-genotype contigs sharing 100% identity between PS26 and BC8 ovule transcripts which could be mapped to the ASGR-carrier
chromosome based on the best hit of the contigs to protein (BlastX) or nucleotide (BlastN) sequences in NCBI databases.



on a level 3 biological process comparison. While many
biological GO terms showed expression level differences
when comparing the PS26 and BC8 libraries, all but
seven became non-significant when the PS26_EST_OTHERS
and BC8_EST_OTHERS libraries were compared.
Six of the transcriptional differences noted
belong to genes involved in either ribosomal or translational
functions. This difference may be caused by
the ploidy level difference between PS26 (an octoploid) and BC8 (a
tetraploid). MIRA assembly will separate alleles of genes
into different contigs. More PS26 allelic transcripts for
genes involved in either ribosomal or translational functions
may be expressed in PS26 than in BC8, thus leading
to a higher transcript difference between the
libraries.

Expression analysis of the ASGR-carrier chromosome
linked genes in BC8 tissue was used to identify transcripts
specific to reproductive tissue. All but two
ASGR-carrier chromosome transcripts showed constitutive
expression in both vegetative and reproductive tissues.
The one reproduction-specific transcript (the
MADS box gene) did not map to the ASGR. The transcript
which could be mapped to the ASGR shows similarity
to "hypothetical" proteins in both sorghum and
rice containing a Transposase_24 domain. Previous
sequencing of BAC clones linked to the ASGR has




Figure 6 Examples of expression patterns for ASGR-carrier
chromosome linked sequences. a: most genes showed expression
in all four organs tested (Ps26_c194: p1604/p1605). b: one gene was
expressed in only ovary and anther (PS26_c33813: p1565/p1566).
RT(+): RT with reverse transcriptase; RT(-): RT without reverse
transcriptase as DNA contamination control.



shown a large number of both Type I and Type II trans- 
posons at the locus [50,13]; therefore, it is not surprising 
that we identified an ASGR-linked transposon transcript 
in our study. 

Conclusions 

Our data show that the combination of selecting specific
reproductive tissues and sequencing with 454 high-throughput
sequencing technology is a promising
approach for identification of genes involved in different
developmental events, and that longer transcript
contigs will be required to allow easier
mapping of these transcripts. Given the rapid advancements
in next-generation sequencing technologies that
enable very deep sequence coverage and paired-end 
reads, it is likely that the fine tissue dissection requiring 
RNA amplification of starting materials now could be 
eliminated to favor longer transcript assemblies. 

Methods 

Plant materials 

Pennisetum squamulatum (PS26; PI 319196, 2n = 56)
and backcross line 8 (BC8)-line 58 were used for ovule
collection. Compared with the BC7 line which was used
in previous studies [12], the BC8-line 58 contains only
one alien chromosome from PS26, the ASGR-carrier
chromosome [31]. P. glaucum (IA4X), P. purpureum
(N37), 4 apomictic and 4 sexual plants from BC8-line 58
(BC8 is facultative; thus it produces ~18% sexually
derived offspring) were used for assigning the candidate
transcript fragments to the ASGR-carrier chromosome.
Twenty-two individuals from a segregating F1 population
between P. squamulatum and P. glaucum were
used for mapping the transcript fragments to the ASGR.

RNA isolation 

Young florets were dissected from small inflorescence 
sections whose anthers were at stages between premeio- 
sis and prophase, as determined by acetocarmine stain- 
ing of anther squashes. One group of florets was stored 
in RNALater® solution (Ambion, Austin, TX, USA) at 
4°C while the other group was processed for ovary 
clearing by methyl salicylate [51] to screen for the ovary 
developmental stage. Ovules from thirty cleared florets
were examined for each group. If the cleared sample 
showed AIs in less than 30% of the ovaries and the 
remaining ovaries were at an earlier developmental 
stage, then florets stored in RNALater® solution from 
the same section of inflorescence were used for ovule 
dissection. About 40 ovules per sample were collected 
and total RNA was extracted from the ovules with 
RNAqueous®-Micro Kit (Ambion). RNA integrity and 
quantity were analyzed with an Agilent 2100 Bioanalyser 
(Santa Clara, CA) at the Interdisciplinary Center for 
Biotechnology Research (ICBR) of the University of 
Florida. 

RNA amplification and ds-cDNA synthesis for Roche 454 
sequencing 

With total RNA as starting material, mRNA was ampli- 
fied by T7-based in vitro transcription following the 
manual of TargetAmp™ 2-Round aRNA Amplification
Kit 2.0 (Epicentre, Madison, WI). Size range and quan- 
tity of the amplified mRNA were measured by both gel 
electrophoresis and Agilent 2100 Bioanalyser analysis. 
For each sample, an equal amount of amplified mRNA 
from the three biological replicates was pooled for ds- 
cDNA synthesis following the protocol developed by the 
Schnable lab [52]. Size-range and quantity of ds-cDNA 
were also analyzed by both gel electrophoresis and using 
the Agilent 2100 Bioanalyser before submitting the sam- 
ples for sequencing. 

454 sequencing and processing 

About 6 µg of ds-cDNA from both PS26 and BC8 was
submitted to the Genome Sequencing Center at 
Washington University for 454-FLX sequencing. Sam- 
ples of cDNA were subjected to mechanical shearing 
(nebulization), size selected, and blunt-end fragments 
were ligated to short adaptors, which provided primer 
target sites for both amplification and sequencing. 
Sequencing files (Accession #SRA030528) were submitted
to the Sequence Read Archive at NCBI
http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=studies.
The Multifunctional Inertial Reference Assembly 
(MIRA) program [38] was used to process and assemble 
the sequences from each library. Adaptor sequences and 
low quality sequence reads were removed prior to 
assembly. The assembly was run as a de novo, 454 EST 
project with accurate assembly and polyA/T clipping. 
Each library of contig assemblies from PS26 and BC8
was converted to a database and analyzed with the
BlastN program provided by the RCC (Research Computing
Center) at the University of Georgia http://rcc.uga.edu.
The PS26 library contigs were chosen as
queries and the BC8 library was chosen as the database.
The BlastN analysis was performed with an E-value cutoff
of < e-100. The BlastN output was parsed using an
internal script such that only contigs with 100% identity
over at least 100 bp were selected for further analysis.
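
The internal parsing script is not shown in the paper. As a minimal sketch of such a filter, assuming the BlastN results were exported in the common 12-column tab-delimited layout (an assumption, since the output format is not stated), the selection could look like the following. The function name and the example rows are hypothetical and are not data from this study.

```python
import csv
import io


def select_identical_pairs(tabular_text, min_identity=100.0,
                           min_length=100, max_evalue=1e-100):
    """Return (query, subject, alignment_length) for hits passing all cutoffs.

    Assumed column order (tab-delimited): qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    selected = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity = float(row[2])
        length = int(row[3])
        evalue = float(row[10])
        if (identity >= min_identity and length >= min_length
                and evalue < max_evalue):
            selected.append((query, subject, length))
    return selected


# Made-up rows for illustration only.
EXAMPLE = "\n".join([
    "query_A\tsubject_A\t100.00\t241\t0\t0\t1\t241\t5\t245\t1e-125\t446",
    "query_B\tsubject_B\t98.50\t300\t4\t0\t1\t300\t1\t300\t1e-140\t500",
    "query_C\tsubject_C\t100.00\t80\t0\t0\t1\t80\t1\t80\t2e-38\t148",
])

print(select_identical_pairs(EXAMPLE))
```

In this sketch the second row fails the identity cutoff and the third fails both the length and E-value cutoffs, so only the first pair is kept.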

BLAST analysis of the selected contigs 

BlastX was used to analyze sequences mapping to the 
ASGR-carrier chromosome by searching against the 
NCBI (National Center for Biotechnology Information, 
http://www.ncbi.nlm.nih.gov/) databases. A BlastN analysis
was conducted on contigs without significant
BlastX hits (e-value < e-06) to search for similar ESTs
from other species. The most significant EST hit with
an e-value below e-20 was used as a BlastX query to
search for putative encoded proteins.
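
The two-tier annotation described above can be summarized as a small decision rule. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that a missing hit is represented as None.

```python
from typing import Optional

BLASTX_CUTOFF = 1e-6   # a direct BlastX hit counts as significant below this
EST_CUTOFF = 1e-20     # an EST hit must be below this to act as a surrogate


def annotation_route(blastx_evalue: Optional[float],
                     best_est_evalue: Optional[float]) -> str:
    """Decide which search result is used to annotate a contig."""
    if blastx_evalue is not None and blastx_evalue < BLASTX_CUTOFF:
        return "direct BlastX"
    if best_est_evalue is not None and best_est_evalue < EST_CUTOFF:
        return "BlastX of best EST hit"
    return "no hit"


print(annotation_route(1e-30, None))
print(annotation_route(None, 3e-55))
print(annotation_route(None, 1e-10))
```

The EST route is only consulted when the direct BlastX search gives no significant hit, which mirrors the order of the steps in the text.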

Mapping of identical PS26/BC8 contigs to the alien
chromosome and/or ASGR

Fasta files containing sequences from contigs with 100%
identity over at least 100 bp from both PS26 and BC8
libraries were generated. Alignment of each PS26/BC8
contig pair yielded sixty-one assemblies of PS26/BC8
contigs used as candidates for mapping to the ASGR-carrier
chromosome. The 61 PS26/BC8 contigs
were used as queries with BlastN against both the PS26
and BC8 MIRA-assembled databases at an E-value cutoff
of < e-25. The BlastN results were parsed and used to
help estimate the 'uniqueness' of the contig within the
transcriptome. Primers were designed based on the
overlapping region of PS26 and BC8 contigs, and in
some cases included further 3' sequences for primer
design if the contig was unique in both databases. When
multiple contigs from each database showed high similarity
to each other, primers were designed based on the
region with the best polymorphisms to distinguish one
from another. Primers were first tested for amplification
with PS26, IA4X, N37 and 4 apomictic and 4 sexual
plants from a segregating population of BC8. Primer
pairs which did not amplify either IA4X or sexual BC8
individuals were used for further screening with apomictic
and sexual F1s to test for linkage to the ASGR.
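
The screening rule in the paragraph above can be written as a simple presence/absence check. This is a sketch under stated assumptions: the text only requires that IA4X and the sexual BC8 plants do not amplify, and the additional requirement that PS26 and all apomictic BC8 plants do amplify is an assumption added here as a positive control. The function name is hypothetical, and N37 is left out because the text gives no rule for it.

```python
def is_carrier_chromosome_specific(ps26, ia4x, apomictic_bc8, sexual_bc8):
    """Return True if a primer pair behaves as ASGR-carrier chromosome specific.

    ps26, ia4x: bool, band present in that genotype.
    apomictic_bc8, sexual_bc8: lists of bool, one entry per tested plant.
    """
    # Assumed positive control: the donor parent and every apomictic plant amplify.
    present_where_expected = bool(ps26) and all(apomictic_bc8)
    # Rule from the text: no band in pearl millet (IA4X) or in sexual BC8 plants.
    absent_where_expected = (not ia4x) and not any(sexual_bc8)
    return present_where_expected and absent_where_expected


# Four apomictic and four sexual BC8 plants, as in the screening panel.
print(is_carrier_chromosome_specific(True, False, [True] * 4, [False] * 4))
print(is_carrier_chromosome_specific(True, True, [True] * 4, [False] * 4))
```

Primer pairs passing such a check would then go on to the apomictic and sexual F1 screen for linkage to the ASGR itself.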

For SSCP analysis a Bio-Rad Protean II system (Bio- 
Rad Laboratories, Hercules, CA) was used to separate 
fragments in a 1 mm thick 12% non-denaturing PAGE 
gel with 10% glycerol. PCR product (2 µl) was mixed
with 10 µl LIS loading dye (10% sucrose, 0.01% bromophenol
blue, and 0.01% xylene cyanol FF), denatured at
98°C for 10 min and cooled to RT for at least 10 min.
Sample (10 µl) was loaded and the gel was run at 200
V for 20-22 hours at 25°C. Silver staining was used to
detect the SSCP fragments. 

Expression patterns of transcripts mapped to the alien 
chromosome 

Total RNA was extracted from a panel of BC 8 tissues 
including vegetative (leaf, root), and reproductive tissues 



Zeng et al. BMC Genomics 201 1, 12:206 
http://www.biomedcentral.eom/1 471 -21 64/1 2/206 



Page 13 of 15 



at anthesis but before pollination (anther and ovary) 
with QIAGEN RNeasy® Plant Mini kit (QIAGEN, 
Valencia, CA) following the manufacturer's protocol. 
First-strand cDNA was synthesized following the manu- 
facturer's protocol of First-strand cDNA Synthesis kit 
(Invitrogen, Carlsbad, CA). RT-PCR reactions were per- 
formed using primer pairs which mapped to the ASGR- 
carrier chromosome in a total volume of 20 µl containing
1 µl of first-strand cDNA, 1 µM of each primer, 1X
PCR buffer, 1.5 mM MgCl2, 0.2 mM dNTPs, and 1 unit
of JumpStart™ Taq DNA polymerase (Sigma, St. Louis, 
MO). Amplification of contaminating genomic DNA 
was tested by the inclusion of controls that omitted the 
reverse transcriptase enzyme from the cDNA synthesis 
reaction, i.e. no-RT controls. The PCR reaction was
denatured at 94°C for 5 min followed by 35 cycles of 
94°C denaturation for 30 seconds, annealing for 30 sec- 
onds at respective temperatures, and 72°C extension for 
1 min. RT-PCR products were separated on a 1.5% agar- 
ose gel and stained with ethidium bromide. Gel images 
were captured with the Molecular Imager Gel Doc XR 
System (Bio-Rad Laboratories). 

cDNA library construction 

Ovaries and anthers collected from apomictic BC 8 
around anthesis but prior to fertilization were frozen in 
liquid nitrogen. Total RNA was extracted with the 
RNeasy® Plant Mini kit (QIAGEN) and then poly(A)+
RNA was purified from total RNA with Oligotex®
mRNA Mini kit (QIAGEN) following the manufacturers'
protocols. Yield of mRNA was quantified with a NanoDrop
spectrophotometer (Thermo Fisher Scientific Inc.,
Wilmington, DE). mRNA was used for double-stranded 
cDNA synthesis with ZAP-cDNA® Synthesis Kit follow- 
ing the manufacturer's protocol (Stratagene, La Jolla, 
CA). Ligations, packaging, titering of the packaging reac- 
tions, and plaque lifts were conducted following the 
manufacturer's protocol of ZAP-cDNA® Gigapack® III 
Gold Cloning Kit (Stratagene). 

cDNA library screening for target genes 

The apomictic BC8 ovary and anther-enriched cDNA
library was screened with α-32P-labeled probes for
transcripts mapping to the ASGR-carrier chromosome.
The PCR fragments amplified from apomictic BC8 genomic
DNA with the primers used for assigning a fragment
to the ASGR-carrier chromosome were diluted
and labeled with α-32P by PCR in a total volume of 20 µl.
The labeling reaction contained ~0.1 ng primary
PCR fragment, 1.25 unit JumpStart Taq DNA polymerase
(Sigma), 0.25 µM of each primer, 0.5 mM dATP/
dTTP/dGTP mixture, 5 µl of α-32P-labeled dCTP (3000
Ci/mmol) and 1 x PCR buffer (10 mM Tris-HCl, 50
mM KCl, 1.5 mM MgCl2). Probes were purified by
passing through homemade Sephadex G-50 (Sigma) columns,
which were assembled with Ultrafree®-MC Centrifugal
Filter Units (Millipore, Bedford, MA). Pre-hybridization
of the membranes in hybridization buffer
(0.5 M sodium phosphate, 7% SDS, 1 mM EDTA, pH
8.0) containing 0.1 mg ml-1 salmon sperm DNA, which
was denatured in boiling water for 10 minutes and
cooled on ice before adding to the hybridization solution,
was conducted at 65°C for 4 h before addition of
the labeled, denatured probe. Hybridization was conducted
at 65°C overnight followed by three washes at
the same temperature for 30 min each with the following
buffers: 1) 1 x SSC, 0.1% SDS; 2) 0.5 x SSC, 0.1%
SDS; 3) 0.1 x SSC, 0.1% SDS. After the final wash,
membranes were wrapped with plastic film and exposed
to x-ray film overnight at -80°C prior to manually developing
with Kodak® GBX Developer and Fixer (Thermo
Fisher Scientific Inc). Autoradiographs were aligned
with the respective plates to recover hybridizing plaques
with sterile glass pipettes. Recovered plaques were
released in tubes containing 1.0 ml SM phage buffer
(according to the formula in the manual of ZAP-cDNA®
Gigapack® III Gold Cloning Kit) and 20 µl
chloroform (Sigma). After overnight elution at 4°C, 1 µl
SM buffer of each recovered sample was used for PCR
to verify positive signals. Since the primary screening
was carried out with a high density of plaque clones, the
recovered positive plaques were purified after secondary
and tertiary screens at much lower densities. Single plaques
showing positive hybridization signals were recovered
in 500 µl SM buffer with 10 µl chloroform (Sigma)
at 4°C.

Sequencing and mapping of candidate cDNA clones to 
the ASGR 

In vivo excision of single plaque clones was conducted
using ExAssist® helper phage with SOLR® strain following
the protocol in the manual of ZAP-cDNA® Gigapack®
III Gold Cloning Kit (Stratagene). Single colonies
containing the pBluescript double-stranded phagemid
with the cloned cDNA insert were isolated and cultured
in liquid Luria-Bertani (LB) medium containing 100 µg
mL-1 ampicillin at 37°C overnight. An aliquot of each
culture was further grown in freeze broth containing
100 µg mL-1 ampicillin at 37°C overnight and then
stored at -80°C before sending out for sequencing.
Sequencing was conducted with M13 primers (Georgia
Genomics Facility, Athens, GA). Vector and low-quality
sequences were trimmed from the original sequences
with VectorNTI Advanced 10 (Invitrogen) and primers
were designed with VectorNTI using the high quality
cDNA sequences. Primers were then tested with apomictic
and sexual F1s for linkage to the ASGR as
described above.






Blast2GO 

Annotation for each library was performed using Blast2GO
software, http://www.blast2go.org/start_blast2go
[39]. BlastX (database: GenBank nr/E-value cutoff: e-06),
GO term mapping (default values) and Annotation 
(database: b2g-2009 with default values) were used. 
Annotations were validated and augmented using 
ANNEX. Libraries were compared using Fisher's
exact test with an FDR value of <0.01 or <0.05.
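
To illustrate what a per-GO-term library comparison of this kind computes, the following sketch implements a two-sided Fisher's exact test on a 2x2 table and a Benjamini-Hochberg adjustment using only the Python standard library. It is not the Blast2GO implementation. The choice of Benjamini-Hochberg is an assumption, because the text does not name the FDR procedure, and the function names and example numbers are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]].

    Example layout: a = contigs with the GO term in library 1, b = contigs
    without it in library 1; c and d are the same counts for library 2.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up counts and p-values for illustration only.
print(fisher_exact_two_sided(3, 1, 1, 3))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A GO term would be reported as differing between libraries when its adjusted p-value falls below the chosen FDR threshold (0.01 or 0.05 in the text).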

Additional material 
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